As the title suggests, we want to answer the question: how and why does statistical mechanics work? In the most widely used prescription, due to Gibbs, we calculate phase space averages of dynamical quantities and find that these phase averages agree very well with experiments. Clearly, actual experiments are not done on a hypothetical ensemble; they are done on the actual system in the laboratory, and they take a finite amount of time. Thus it is usually argued that actual measurements are time averages, and that they are equal to phase averages due to ergodicity. The aim of the present review is to show that ergodicity is not relevant for equilibrium statistical mechanics (in agreement with Tolman and Landau). We will see that the solution of the problem lies in the very peculiar nature of macroscopic observables and in the very large number of degrees of freedom involved in macroscopic systems, as first pointed out by Khinchin. Similar arguments are used by Landau, based upon the approximate property of "statistical independence". We review these ideas in detail and in some cases present a critique. We review the role of chaos (classical and quantum), where it is important and where it is not. We criticise the ideas of E. T. Jaynes, who says that the ergodic problem is a conceptual one, related to the very concept of the ensemble itself, which is a by-product of the frequency theory of probability, and that the ergodic problem becomes irrelevant when the probabilities of the various micro-states are interpreted with the Laplace-Bernoulli theory of probability (the Bayesian viewpoint).

Finally, we critically review various quantum approaches (quantum-statistical typicality approaches) to the foundations of statistical mechanics. The literature on quantum-statistical typicality is organized under four notions: (1) kinematical canonical typicality, (2) dynamical canonical typicality, (3) kinematical normal typicality, and (4) dynamical normal typicality. Analogies are drawn between Khinchin's classical approach and the modern quantum-statistical typicality approaches.



"Unless the conceptual problems of a field have been 
clearly resolved, you cannot say which mathematical 
problems are the relevant ones worth working on, and 
your efforts are more than likely to be wasted" 

— E. T. Jaynes. 
I. INTRODUCTION

We start the introduction by considering the following views of eminent physicists:

"Although nobody is in doubt today of the validity of the remarkable interpretation of thermodynamics with which statistical mechanics, following the efforts of Boltzmann and Gibbs, has recently provided us, it still remains extremely difficult to give a completely accurate justification for it"

— Louis de Broglie.

"The problem [defining the ensemble distribution depending upon the external conditions imposed on a system] was solved by Gibbs, although a rigorous justification of the distributions obtained is a complicated problem that is still not completely solved at the present time. It is not even clear to what extent this rigorous justification is possible"

— D. N. Zubarev.

"There is probably no other well-established field of theoretical physical science that is as much plagued by paradox and criticism of its fundamental logic as is statistical mechanics"

— Joseph Edward Mayer and Maria Goeppert Mayer [5]

"If we are describing only a state of knowledge about a single system, then clearly there can be nothing physically real about frequencies in the ensemble; and it makes no sense to ask, "which ensemble is the correct one?"... Gibbs understood this clearly; and that, I suggest, is the reason why he does not say a word about ergodic theorems,..."

- E. T. Jaynes



There is a lot of confusion about the ergodic hypothesis, both in the literature and in everyday discussions, regarding whether or not it is necessary for statistical mechanics. The reasons how and why statistical mechanics works are also not properly understood. The problem appears complicated because one has to take into consideration ingredients of very different nature (the character of experimental measurements, the large number of degrees of freedom, and the microscopic dynamical properties) that are involved in the statistical approach to dynamical systems. In the present review we try to disentangle these ingredients and present the resolution of the ergodic problem in a simple and clear way.

A detailed study of the literature shows that there are mainly three camps in the foundations of statistical mechanics: (1) the ergodic school, (2) the non-ergodic school, and (3) the quantum foundations of statistical mechanics. We critically analyse all three approaches in order to reach a broader understanding of the foundations of statistical mechanics, i.e., we attempt to see the "complete picture".

The aim of the present review is to show that ergodicity is not relevant for the equilibrium statistical mechanics of macroscopic systems (in agreement with Tolman and Landau). We will see that the solution of the problem lies in the very peculiar nature of macroscopic observables (they are sum functions) and in the very large number of degrees of freedom involved, as first pointed out by Khinchin. Similar arguments are used by Landau, based upon the approximate property of "statistical independence". We critically review these ideas. We also review the role of chaos (classical and quantum) in the foundations of statistical mechanics, i.e., where it is important and where it is not. For students and beginners, the purpose of the article is best served if they first understand the first chapter of Landau and Lifshitz's book on statistical physics and read Khinchin's little book on the mathematical foundations of statistical mechanics. Again, our motivation is to give a broader view of the subject and a pedagogical presentation.

In the present section we give a brief summary of the essential topics in the form of questions and answers. In section II we review the ergodic approach to the foundations of statistical mechanics, divided into subsections. We first define the ergodic problem and then give a historical account of how it arose from the works of Maxwell and Boltzmann. In subsection II(C) the reformulation of the ergodic problem by Birkhoff is given, and the resolution of the ergodic problem by Khinchin is given in subsection II(D). The role of chaos, integrability, and non-integrability is discussed in subsection II(E).

In section III we present the non-ergodic approach. This section consists of two subsections, one on the Landau-Lifshitz approach and another on E. T. Jaynes' approach, together with a critique. We also advance plausibility arguments for the hypothesis of equal a priori probabilities.

The last section is devoted to the quantum foundations of statistical mechanics. There we review the eigenstate thermalization hypothesis, the quantum ergodic theory of von Neumann, and other recent quantum-statistical typicality approaches. We will see that von Neumann's quantum ergodic theorem is a general statement applicable to systems with many degrees of freedom, and that the eigenstate thermalization hypothesis is a consequence of quantum chaos. Other approaches based on typicality properties are also analyzed.
The issue of the compatibility of microscopic time reversibility and macroscopic time irreversibility is not considered here. This has been clarified in detail in the beautiful papers of Joel Lebowitz [6] (for irreversibility in quasi-isolated systems see [7]).



A. Is ergodic hypothesis necessary for the 
foundations of statistical mechanics? 

In their famous book [8], Landau and Lifshitz wrote: "In the discussion of the foundations of statistical physics, we consider from the start the distribution of small subsystems [...] this allows the complete avoidance of ergodic or similar hypotheses, which are in fact not essential as regards these aims [i.e., the foundations of physical statistics]"

In the most widely used prescription, due to Gibbs, we calculate phase space averages of dynamical quantities using appropriate ensembles, and we find that these phase averages agree very well with experiments. Clearly, actual experiments are not done on a hypothetical ensemble; they are done on the actual system in the laboratory, and they take a finite amount of time. Thus it is usually argued that actual measurements are time averages, and that they are equal to phase averages due to ergodicity,



$$\bar{f} = \lim_{T\to\infty} \frac{1}{T}\int_0^T f(x(t))\,dt = \int \rho(x) f(x)\,dx. \qquad (1)$$

We now give the following reasons against ergodicity.

(1) If one assumes that during the measurement time the system samples all the microstates, so that the $T \to \infty$ limit of equation (1) is reached during the measurement time, then one's assumption is wrong. It is a well-known fact that the time taken by the system to sample all of the available phase space is fantastically large (more than the age of the universe). For example [9], consider a nanoparticle with 1000 nuclei, each with spin one-half, and consider only the nuclear spin system. The total number of quantum states is $2^{1000}$. Suppose that these spins continuously flip from up to down and vice versa, and that this spin flipping is caused by phonons from the system and from the bath in which the nanoparticle sits. At room temperature the phonon frequency is about $10^{12}$ cycles per second. Let us assume that each spin makes one transition during this period. Thus the total number of transitions per second is $1000 \times 10^{12} = 10^{15}$. The time taken by the system to go over all the states is $2^{1000}/10^{15} \sim 10^{286}$ sec, whereas the age of the universe is $10^{17}$ sec. Thus, clearly, during the laboratory measurement time the full ensemble is not realized; only a very tiny fraction of the total number of members of the ensemble is realized. Note that the time scale diverges nearly exponentially with the number of atoms in the sample.
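The arithmetic of this estimate can be checked with a few lines of Python. This is only a sketch of the counting argument above, working with base-10 logarithms because $2^{1000}$ overflows ordinary floating-point numbers; the function name and its arguments are illustrative choices, not part of the original argument.

```python
import math

def log10_exploration_time(n_spins, flip_rate_per_spin):
    """Return log10 of the time (in seconds) needed to visit every spin
    configuration once, assuming one new configuration per spin flip."""
    log10_states = n_spins * math.log10(2)                  # log10(2**n_spins)
    log10_rate = math.log10(n_spins * flip_rate_per_spin)   # transitions per second
    return log10_states - log10_rate

log10_time = log10_exploration_time(1000, 1e12)
log10_age_of_universe = 17  # value used in the text

print(f"time to visit all states ~ 10^{log10_time:.0f} s")
print(f"ratio to the age of the universe ~ 10^{log10_time - log10_age_of_universe:.0f}")
```

Changing `n_spins` shows how quickly the time scale grows: each additional spin doubles the number of states, while the transition rate grows only linearly.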

(2) Suppose instead that one accepts the ergodic hypothesis (and does not accept explanation (1)): the laboratory measurements are time averages, and because the measurement time scale is much larger than the microscopic dynamical time scale (the time taken to flip a spin, $\sim 10^{-12}$ sec), one accepts equation (1) with $T \to \infty$. Then the time-averaged value of an observable (the result of a measurement) in a series of experiments (repeating the same experiment again and again on the same system) will always agree with the ensemble average. This is in fundamental contradiction with what we observe, i.e., thermodynamic measurements are not strictly reproducible [10]. Fluctuations in thermodynamic measurements are the unavoidable consequence of our lack of control over the microscopic dynamics, and thus over the initial micro-condition at the start of the measurement.



B. How does statistical mechanics work and why 
does statistical mechanics work so well? 

The above discussion against ergodicity raises the question: how, then, does statistical mechanics work, and why has the theory been so successful? The answer is as follows.

In statistical mechanics we are concerned with macroscopic systems and with macroscopic observables, which are very special ones. They are special in the sense that they are "sum functions", i.e., their value for the whole system is equal to the sum of equivalent functions for the small parts of the body (for example, the energy of the whole system is equal to the sum of the energies of its small parts). Due to the property called "statistical independence" of the small parts of the body, the sum functions take an almost constant value on the energy hypersurface (details are given in section II(D)).

A measurement, over whatever time it takes, therefore produces that "same" value. Thus the equality of the time average and the phase space average is imposed by the "temporal constancy of special macroscopic observables in equilibrium". The relative magnitude of the fluctuations goes as $\sqrt{\langle O^2\rangle - \langle O\rangle^2}/\langle O\rangle \propto 1/\sqrt{N}$. Thus these are very small for macroscopic systems.
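The $1/\sqrt{N}$ behaviour of a sum function can be illustrated numerically. The sketch below assumes a toy model that is not taken from the text: the observable is the sum of $N$ statistically independent "local energies", each drawn uniformly from $[0, 1)$. For this toy model the relative fluctuation is exactly $1/\sqrt{3N}$, which the sampled values should approach.

```python
import math
import random

def relative_fluctuation(n_parts, n_samples, rng):
    """Sample a sum function O = sum of n_parts independent local energies
    (each uniform in [0, 1)) and return std(O) / mean(O)."""
    values = []
    for _ in range(n_samples):
        values.append(sum(rng.random() for _ in range(n_parts)))
    mean = sum(values) / n_samples
    var = sum((v - mean) ** 2 for v in values) / n_samples
    return math.sqrt(var) / mean

rng = random.Random(0)  # fixed seed so that the run is repeatable
results = {n: relative_fluctuation(n, 2000, rng) for n in (10, 100, 1000)}
for n, r in results.items():
    # second column: sampled value; third column: exact value 1 / sqrt(3 n)
    print(n, round(r, 4), round(1 / math.sqrt(3 * n), 4))
```

Increasing the number of parts by a factor of 100 reduces the relative fluctuation by roughly a factor of 10, which is the sense in which sum functions become sharp for macroscopic $N$.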

Put differently, for an extremely large number of micro-states (called typical micro-states) the macro-state of the system is the same. Only very exceptional micro-states ("bad states", with fantastically small probability) give non-typical behaviour. To illustrate this, consider a macroscopic system and measure an observable. Repeat the experiment many, many times. Each time the starting micro-state is different, but the important point is that each time we get approximately the same value for the observable (since a huge number of micro-states are equilibrium micro-states). It is extremely rare for a measurement to give a value very different from the value found in the other measurements.
This "typicality" is at the heart of why statistical mechanics explains thermodynamic behaviour. It is here that probability theory enters the statistical description. Typicality explains the high predictive power of statistical mechanics for macroscopic systems (even though statistical mechanics has a probabilistic basis).

C. What is the role of microscopic dynamics in 
equilibrium statistical mechanics ? 

As the macro-observables are extremely insensitive to the micro-condition and to the microscopic dynamics, the role of microscopic dynamics in equilibrium statistical mechanics is very insignificant. See section II(D) for a detailed explanation, and section II(E) for the role of the Fermi-Pasta-Ulam problem in this connection.

It is also a good idea to abandon the 19th-century philosophy of explaining everything we see around us through microscopic dynamics, called "reductionism". Boltzmann initially tried to explain the second law of thermodynamics with microscopic molecular dynamics, but later realized the need for statistical laws. At each level of complexity new laws emerge which are not explainable "purely" by the Schrödinger equation [11].

II. ERGODIC APPROACH 

A. The Ergodic Problem 

In standard practice, for example for a system of volume $V$ with $N$ particles in equilibrium with a very large heat bath at temperature $T$, one uses the Gibbs canonical ensemble

$$\rho_{can}(p,q) = \frac{e^{-\beta H(p,q)}}{Z(\beta,V,N)}, \qquad Z(\beta,V,N) = \int e^{-\beta H(p,q)}\,dp\,dq. \qquad (2)$$

Here $\rho_{can}(p,q)\,dp\,dq$ is the probability to find the system in the infinitesimal phase volume $dp\,dq$ around the phase point $(p,q)$ [$(p,q) = (p_1, p_2, \ldots, p_N; q_1, q_2, \ldots, q_N)$ for a system of $3N$ degrees of freedom], and $\beta = 1/(k_B T)$, with $k_B$ the Boltzmann constant.

The negative logarithm of the partition function $Z$ is proportional to the free energy of the system:

$$F(\beta,V,N) = -\frac{1}{\beta}\ln Z(\beta,V,N). \qquad (3)$$

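From the free energy and its derivatives all the needed information is extracted. Each such quantity takes the form of a Gibbs ensemble average; for example, the pressure is given by

$$P = -\left(\frac{\partial F}{\partial V}\right)_{\beta,N} = -\frac{1}{Z}\int e^{-\beta H}\left(\frac{\partial H}{\partial V}\right)_{N} dp\,dq = \langle P \rangle, \qquad (4)$$

with the Hamiltonian $H$ a function of the volume $V$ (through the potential function, the boundaries, etc.).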
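As a quick check of the route "partition function, then free energy, then pressure", the sketch below applies equations (3) and (4) to the simplest case, a classical ideal gas, which is an example chosen here and not discussed in the text. For $N$ non-interacting particles $Z \propto V^N (2\pi m/\beta)^{3N/2}$, so one expects $P = N/(\beta V)$. Units with $k_B = 1$ and all function names are assumptions of this illustration.

```python
import math

def log_partition_ideal_gas(beta, volume, n_particles, mass=1.0):
    """ln Z for N non-interacting classical particles in a box of volume V.
    Constant factors (N!, Planck's constant) are dropped; they do not
    depend on V and therefore do not affect the pressure."""
    momentum_part = 1.5 * n_particles * math.log(2.0 * math.pi * mass / beta)
    position_part = n_particles * math.log(volume)
    return momentum_part + position_part

def free_energy(beta, volume, n_particles):
    # equation (3): F = -(1/beta) ln Z
    return -log_partition_ideal_gas(beta, volume, n_particles) / beta

def pressure(beta, volume, n_particles, dv=1e-4):
    # central finite difference for P = -(dF/dV) at fixed beta and N
    f_plus = free_energy(beta, volume + dv, n_particles)
    f_minus = free_energy(beta, volume - dv, n_particles)
    return -(f_plus - f_minus) / (2.0 * dv)

beta, volume, n_particles = 2.0, 5.0, 100
p_numeric = pressure(beta, volume, n_particles)
p_ideal = n_particles / (beta * volume)   # ideal-gas law with k_B = 1
print(p_numeric, p_ideal)
```

The two printed numbers should agree up to the small error of the finite-difference derivative.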

Thus our computational algorithm involves phase space averaging. But actual experiments are done on the given system in the laboratory (not on the hypothetical ensemble). Measurements take a finite amount of time, and thus what we measure in the laboratory are time averages, not ensemble averages. So the immediate question arises of how to justify the replacement of time averages with ensemble averages. This is called the ergodic problem.

There is considerable confusion regarding the precise meaning of the ergodic hypothesis. We therefore present a brief historical account of its origin and of its misinterpretation by the Ehrenfests [12] [T5]. Through this analysis we will better understand the ergodic program for justifying statistical mechanics.



B. Historical account: Boltzmann's ergodic 
program 

Boltzmann's first attempt to reduce the second law of thermodynamics to a theorem in mechanics appeared in 1866, and his last papers on the subject, concerning his debates on irreversibility, appeared in the columns of Nature in the 1890s. During his scientific career (1866-1895) Boltzmann tried to understand the kinetic theory of gases and thermodynamics through many different approaches based on the atomic constitution of matter, favouring one approach at one time, then rejecting it for another, and then returning again to the previous one. This is a very distinctive trait of Boltzmann [14]. Roughly, from 1866 to 1871 he was interested in deriving the second law of thermodynamics purely from mechanics (in his 1866 paper no probabilistic argument is present). In 1867 he read Maxwell's work, and his subsequent papers from 1868 to 1871 use probabilistic notions, following Maxwell's velocity distribution function. In these papers he extends Maxwell's results to a gas in an external potential (the Maxwell-Boltzmann law). In the paper [TS] of 1868 he first introduces his ergodic hypothesis. He considers an isolated gas in an arbitrary initial state (see the Ehrenfests [H]) and argues, based on the empirical fact that systems tend to equilibrium and permanently stay there, that the average behaviour over a long time interval will be the thermal equilibrium behaviour (note that this is long before his H-theorem). According to the Ehrenfests, here he introduces the concept of ensembles (well before Gibbs). In justifying the equivalence of ensemble averages and temporal averages he introduces a hypothesis (note that he does not use the word ergodic at this point; the word Ergoden appears much later, in his 1884 paper, in a different context). Thus the present-day terminology is largely due to the Ehrenfests. Note also that this is the birth of modern statistical mechanics. The older treatments, in which statistical aspects are related to single molecules, are called (by the Ehrenfests) the "Kineto-statistics of the molecule", whereas the 1868 paper treats the gas model as a whole, and the Ehrenfests call this the "Kineto-statistics of the gas model".

FIG. 1: Boltzmann in the center.

An important point to note here is that the version of the ergodic hypothesis attributed to Boltzmann is stronger than what its founding father had in mind:

"The great irregularity of the thermal motion, and the multiplicity of forces that act on the body from outside, make it probable that the atoms themselves, by virtue of motion that we call heat, pass through all possible positions and velocities consistent with the equation of kinetic energy, and that we can therefore apply the equations previously developed to the coordinates and velocities of the atoms of warm bodies"

The present strong form of the ergodic hypothesis (the trajectory passing through each and every point of the energy surface of an isolated system) cannot be attributed to Boltzmann; it is a misinterpretation on the Ehrenfests' side. It is impossible for the representative phase point of the system to pass through every point of the phase space. In fact, in 1913 the impossibility of ergodic systems (in the stronger form) was proved independently by Rosenthal and Plancherel using measure-theoretic arguments. The Ehrenfests, after misinterpreting Boltzmann's ergodic hypothesis, proposed a somewhat weaker form which they called the quasi-ergodic hypothesis. It states that the trajectory of the phase point covers the energy surface densely, even though it does not actually pass through every point of it. An excellent discussion of this misinterpretation is given in the work of S. G. Brush US].

In the period 1872-1878 he wrote the two most important papers of his life. The 1872 paper contains what we now call the "Boltzmann equation" and the H-theorem. Here he claimed that his H-theorem provided the desired theorem from mechanics to explain irreversibility and the second law of thermodynamics. This met a serious objection from Loschmidt in 1876, who insisted on the incompatibility of the time-asymmetric behaviour (shown by the H-theorem) with the time-symmetric behaviour of the microscopic equations of motion (these debates are now well understood and documented [6]). As a result Boltzmann rethought the basis of his approach and in 1877 presented a very different one, which we now call the permutational argument, together with his famous result $S \propto \ln W$ (see appendix A), written in its modern form by Max Planck. Equilibrium is now conceived as the most probable macro-state instead of the stationary macro-state (in time). It is highly probable for a system initially in a non-equilibrium state to move towards the equilibrium state (a real-space density distribution in accord with the external conditions (potentials), and a Maxwellian velocity distribution), as the phase space volume of the equilibrium macro-state is fantastically large compared to that of a non-equilibrium state. Thus equilibrium is the most probable state; arbitrary deviations from it are also possible, but their probability turns out to be fantastically small. This is the origin of the notion of typicality.

In the third period, the 1880s, he left the "purely" probabilistic approach and went back to his mechanical explanation of the laws of thermodynamics (we see here that Boltzmann did not stick to one approach). He wrote an important paper in 1884, largely forgotten by now (see the review by Gallavotti [16]). This paper is a generalization of a paper by Helmholtz on mechanical models of thermodynamics based on monocyclic systems. Boltzmann proves a theorem called the heat theorem, i.e., that $(dU + P\,dV)/T$ is an exact differential. Here $T$ is the time average of the kinetic energy $K$, $U = K + \Phi$ is the total energy, and $\Phi$ is the potential energy, which depends on an experimentally controllable parameter $V$. The motivation was to find mechanical analogues of the thermodynamic entropy. The result was proved for monocyclic systems by Helmholtz, and Boltzmann generalized it to high-dimensional systems under the assumption of ergodicity (note that this is the second time he introduces ergodic considerations). In his discrete world-view he imagined the phase trajectory visiting all the discrete cells of the phase space. With the introduction of a stationary phase space distribution (he calls it a "monode"), the calculation of difficult time averages can be replaced with much simpler phase averages. He considers an ensemble of systems in exactly the same macroscopic conditions and a family of stationary distributions, which he calls "monodes" (note that this is the introduction of ensembles well before Gibbs). A monode which verifies the heat theorem is called by him an "orthode". Here, for the first time, he considers two orthodes: (1) the ergode (= microcanonical ensemble) and (2) the holode (= canonical ensemble); see Fig. 2.

As we have seen in section I(A), in real thermodynamic measurements we do not have infinite time averages, and thus we are not justified in equating the finite time averages with infinite time averages and then using the ergodic hypothesis (which has never been proved in generality). The answer to this was already clear to Boltzmann. It lies in the peculiarities of the thermodynamic observables and in the large number of degrees of freedom involved. We will discuss this in section II(D).

We also note that the original ergodic hypothesis as envisaged by Boltzmann was a weaker statement than its modern version.
[Diagram: Monode (stationary distributions) contains Orthode (monodes which verify the heat theorem), which includes the Ergode (microcanonical ensemble) and the Holode (canonical ensemble).]

FIG. 2: Boltzmann's 1884 formulation of statistical ensembles.



C. Ehrenfests, Birkhoff and the modern ergodic 
theory 

The Ehrenfests' 1911 encyclopedia article [12] generated a lot of interest in ergodic systems in the community of mathematicians (Paul Ehrenfest was a student of Boltzmann).

If one is able to prove the ergodic hypothesis for a given system, i.e.,

$$\bar{f} = \lim_{T\to\infty}\frac{1}{T}\int_0^T f(x(t))\,dt = \int \rho(x) f(x)\,dx,$$

then

• one eliminates the necessity of determining the initial state of the system and of solving Hamilton's equations;

• one justifies the dynamical foundation of statistical mechanics.

Clearly, statistical mechanics then reduces to a branch of mechanics. But the story is not so simple: this is a very difficult problem. In 1931, G. D. Birkhoff [17] reduced the ergodic property of a dynamical system to an equivalent property called "metric transitivity". He proved the following theorems:



• The time average $\bar{f} = \lim_{T\to\infty}\frac{1}{T}\int_0^T f(U^t x(0))\,dt$ exists for almost every $x(0)$.

• A necessary and sufficient condition for the system to be ergodic is that the phase space be metrically transitive.

A system is "metrically intransitive" iff there exist regions $X_1$ and $X_2$ of phase space, each of non-zero measure, such that $X_1 \cap X_2 = \emptyset$ and $X_1 \cup X_2 = X$, which are invariant under the system's dynamics: $U^t X_1 \subseteq X_1$ and $U^t X_2 \subseteq X_2$ for all $t$.

In simple words, the phase point wanders over all of the available phase space if and only if the system is metrically transitive.

This property, again, cannot be verified experimentally. Thus the implications of Birkhoff's theorems for physical statistical mechanics are inconclusive.

But, luckily or unluckily, we have seen that ergodicity is not required for doing statistical mechanics of macroscopic systems.
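The difference between a metrically transitive and a metrically intransitive system can be seen in a toy example that is not taken from the text: the rotation of the unit circle, $x \to (x + \alpha) \bmod 1$, used here as a discrete-time stand-in for the flow $U^t$. For irrational $\alpha$ the rotation is ergodic and time averages equal the phase average. For rational $\alpha$ the circle splits into invariant pieces and the time average depends on the starting point. The function names and the test function $\cos(8\pi x)$ are choices made for this sketch.

```python
import math

def time_average(f, x0, alpha, n_steps):
    """Average f along the orbit x_{k+1} = (x_k + alpha) mod 1."""
    total, x = 0.0, x0
    for _ in range(n_steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n_steps

def f(x):
    # test observable; its phase average over [0, 1) is zero
    return math.cos(8.0 * math.pi * x)

golden = (math.sqrt(5.0) - 1.0) / 2.0   # an irrational rotation number

for x0 in (0.0, 0.1):
    ergodic = time_average(f, x0, golden, 20000)    # metrically transitive case
    periodic = time_average(f, x0, 0.25, 20000)     # metrically intransitive case
    print(x0, round(ergodic, 4), round(periodic, 4))
```

In the irrational case both starting points give a time average close to the phase average, zero. In the rational case the orbit returns to itself after four steps, and the time average is simply $f(x_0)$.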



D. Resolution by Khinchin 

Khinchin [18] (see also the 4th chapter of the book [20]) approached the ergodic problem from a physical point of view. He pointed out the importance of the following points:

1. In statistical mechanical systems of interest the number of degrees of freedom is very large.

2. In statistical mechanical systems the observables of interest are very special ones; they are "sum functions": $f(p,q) = \sum_{i=1}^{N} f_i(p_i, q_i)$.

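With these considerations he proves the following two theorems.

Theorem (1): Consider a physical quantity $f_i(p_i,q_i)$ pertaining to a single molecule and define the phase coefficient of correlation $R(s) = \frac{\langle f_i(t) f_i(t+s)\rangle - \langle f_i\rangle^2}{\langle f_i^2\rangle - \langle f_i\rangle^2}$, where $\langle\cdot\rangle$ denotes the phase average. If $R(s) \to 0$ for $s \to \infty$, the function $f_i(p_i,q_i)$ is ergodic (time average of $f_i$ = phase average of $f_i$).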

The physical assumption that goes into this theorem is, in Khinchin's words: "Because of the fact that the given system consists of a very large number of molecules it is natural to expect that knowledge of the state of a single molecule at a certain moment does not permit us to predict anything (or almost anything) about the state in which this molecule will be found after a sufficiently long time".

This is equivalent to the idea of molecular chaos.

Next he proves a result much more relevant to physical statistics.

Theorem (2) [20]:

$$\mathrm{Prob}\left(\frac{|f - \langle f\rangle|}{|\langle f\rangle|} \ge K_1 N^{-1/4}\right) \le K_2 N^{-1/4}. \qquad (5)$$

Here $f$ is the actual value of the observable (no time averaging!) pertaining to the whole system, $\langle f\rangle$ is its phase average, and $K_1$ and $K_2$ are $O(1)$. He proves that the correlation coefficient between phase functions is actually very small for large $N$.

This implies that the physically relevant observables are self-averaging!

The above statement is the resolution of the whole problem (for macroscopic systems and sum-function observables).
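The self-averaging property can be illustrated by sampling micro-states directly on an energy surface. The sketch below assumes a toy model not taken from the text: $N$ free unit-mass momenta with fixed $\sum_i p_i^2 = N$, sampled uniformly on that sphere, and the sum function $f = \sum_{i \le N/2} p_i^2$ (twice the kinetic energy of half of the particles). It counts the fraction of micro-states for which $f$ deviates from its phase average by more than 10 percent. It does not test the specific $N^{-1/4}$ bound, only the qualitative statement that atypical micro-states become rare as $N$ grows.

```python
import math
import random

def sample_energy_shell(n, rng):
    """Return a point drawn uniformly from the sphere sum(p_i**2) = n,
    i.e. a micro-state of n free unit-mass momenta at fixed total energy."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(n / sum(x * x for x in g))
    return [scale * x for x in g]

def atypical_fraction(n, delta, n_samples, rng):
    """Fraction of sampled micro-states whose sum function
    f = sum of p_i**2 over the first n/2 momenta deviates from its
    phase average n/2 by more than the relative amount delta."""
    phase_average = n / 2.0
    count = 0
    for _ in range(n_samples):
        p = sample_energy_shell(n, rng)
        f = sum(x * x for x in p[: n // 2])
        if abs(f - phase_average) / phase_average > delta:
            count += 1
    return count / n_samples

rng = random.Random(1)  # fixed seed so that the run is repeatable
fractions = {n: atypical_fraction(n, 0.1, 500, rng) for n in (16, 256, 4096)}
print(fractions)
```

For small $N$ most micro-states are "atypical" by this criterion, while for $N$ of a few thousand essentially none are. No time evolution enters the calculation at all.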



E. Role of chaos, integrability, and 
non-integrability 

As far as one is concerned with macroscopic systems (large $N$) and sum-function observables, chaos, integrability, and non-integrability do not play any role in the foundations of statistical mechanics. As statistical mechanical systems in the laboratory are never isolated, the molecular chaos assumption of Khinchin seems to be true, and the consequences are clear to us.

But for "isolated" low-dimensional systems and non-sum-function observables the case is not so simple. Before 1955 there was a general consensus that weak extraneous interactions make statistical mechanical systems ergodic (the phase point wandering over the whole of phase space).

In 1955, Fermi and his collaborators performed a numerical experiment on a chain of non-linearly coupled harmonic oscillators. Their expectation was that the small non-linear coupling would make the system ergodic (energy stored initially in one of the normal modes would, in time, spread to all the other modes). But the result was quite surprising. Consider the Hamiltonian of the system,

$$H = \sum_{i=0}^{N}\left[\frac{p_i^2}{2} + \frac{(q_{i+1}-q_i)^2}{2} + \frac{\epsilon}{r}(q_{i+1}-q_i)^r\right].$$

For $\epsilon = 0$ the system is completely integrable and the energy $E_k = \frac{1}{2}(\dot{Q}_k^2 + \omega_k^2 Q_k^2)$ of each normal mode $Q_k = \sqrt{\frac{2}{N}}\sum_{n=1}^{N} q_n(t)\sin(\pi k n/N)$ is conserved.

It was expected that when $\epsilon \neq 0$ the energy stored initially in one normal mode would "spread" to all the other modes. But this did not happen: the energy periodically comes back to the original mode, showing no sign of equipartition of energy.

This "paradox" can be explained with the KAM theorem, which states that if $\epsilon$ is small enough then, on the constant energy surface, invariant tori survive in a region whose measure tends to 1 as $\epsilon \to 0$.

There is a threshold $\epsilon_c$, called the KAM threshold: (1) if $\epsilon < \epsilon_c$, KAM tori play a major role and the system does not follow equipartition, and (2) if $\epsilon > \epsilon_c$, KAM tori play a minor role and equipartition happens.
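A small numerical experiment in the spirit of the Fermi-Pasta-Ulam study can be sketched as follows. This is not the original 1955 code or parameter set: it assumes a cubic coupling ($r = 3$), fixed ends, unit masses, a short chain, and a velocity-Verlet integrator, all chosen for illustration. All energy is initially put into the lowest normal mode, and the harmonic mode energies $E_k$ can be inspected at the end of the run.

```python
import math

def forces(q, eps):
    """Force on each moving particle of the chain with fixed ends.
    q[0] and q[-1] are the fixed walls; the pair potential is
    d**2/2 + eps*d**3/3 with d = q[i+1] - q[i]."""
    n = len(q)
    f = [0.0] * n
    for i in range(1, n - 1):
        right = q[i + 1] - q[i]
        left = q[i] - q[i - 1]
        f[i] = (right - left) + eps * (right ** 2 - left ** 2)
    return f

def mode_energies(q, p):
    """Harmonic energy E_k of each normal mode k = 1..n-1."""
    n = len(q) - 1
    norm = math.sqrt(2.0 / n)
    energies = []
    for k in range(1, n):
        qk = norm * sum(q[j] * math.sin(math.pi * k * j / n) for j in range(1, n))
        pk = norm * sum(p[j] * math.sin(math.pi * k * j / n) for j in range(1, n))
        omega = 2.0 * math.sin(math.pi * k / (2.0 * n))
        energies.append(0.5 * (pk ** 2 + (omega * qk) ** 2))
    return energies

def total_energy(q, p, eps):
    kinetic = 0.5 * sum(x * x for x in p)
    potential = 0.0
    for i in range(len(q) - 1):
        d = q[i + 1] - q[i]
        potential += 0.5 * d * d + eps * d ** 3 / 3.0
    return kinetic + potential

def initial_state(n, amplitude=1.0):
    """All energy in the lowest mode: a half sine wave at rest."""
    q = [amplitude * math.sin(math.pi * j / n) for j in range(n + 1)]
    q[0] = q[n] = 0.0
    return q, [0.0] * (n + 1)

def simulate(n=16, eps=0.25, dt=0.05, n_steps=4000):
    """Velocity-Verlet integration of the chain from initial_state(n)."""
    q, p = initial_state(n)
    f = forces(q, eps)
    for _ in range(n_steps):
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
        q = [qi + dt * pi for qi, pi in zip(q, p)]
        f = forces(q, eps)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
    return q, p

q, p = simulate()
print([round(e, 5) for e in mode_energies(q, p)[:4]])
```

Recording `mode_energies` at regular intervals over a much longer run, and for different values of `eps` and of the initial amplitude, is the natural way to look for the recurrences described above. The short run here only demonstrates the setup.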



FIG. 3: Canonical ensemble in FPU system 

1. Canonical ensemble in FPU system 

In 1987, Livi, Pettini, Ruffo, and Vulpiani published an important paper [T5] on the relevance of chaos for the predictions of the canonical ensemble. They divided the chain into coupled subsystems, each containing a large number of particles (see figure 3). Let $U$ be the time-averaged energy per particle of a subsystem, $U = \overline{E(t)}/N_{sub}$, and $C_v$ its heat capacity, obtained from the energy fluctuations, $C_v = (\overline{E^2} - \overline{E}^2)/(N_{sub} T^2)$, with $T = \overline{p^2}$.

It was found that the time averages [47] agree well with the predictions of the canonical ensemble, even though the system had KAM tori. However, the story is not so simple. They also considered another model, of coupled rotators, in which only $U$ showed canonical behaviour, and not $C_v$, as the integrability-to-non-integrability parameter was changed (for details see chapter 4 of [20]). Note also that $C_v$ is not a sum function.

Thus one can loosely say: "the coarser your observables, the less you depend on chaos".

III. NON-ERGODIC APPROACH 

A. Landau-Lifshitz approach 

As expressed at the beginning of section I(A), Lev Landau did not believe in the need for ergodic or similar hypotheses in justifying the foundations of statistical mechanics [8]. From the beginning, Landau and Lifshitz consider a system that is always in contact with a bath, i.e., a subsystem of a given system which is itself part of a bigger system (as in nature no system is ideally isolated). They argue that the extreme complexity of the interactions of the system with the surrounding bodies makes the system's phase point wander in phase space. If, in a sufficiently long interval of time $T$, the phase point spends an amount of time $\Delta t$ in a given volume $\Delta p\,\Delta q$ of the phase space, then

$$w = \lim_{T\to\infty}\frac{\Delta t}{T} \qquad (6)$$

is the probability that, if the system is observed at any arbitrary time, its phase point will be found in the volume $\Delta p\,\Delta q$. Going to infinitesimals, $dw = \rho_{actual}(p,q)\,dp\,dq$. Here $\rho_{actual}(p,q)$ is the density of that temporal probability distribution, i.e., $\rho_{actual}(p,q)\,dp\,dq$ is the probability to find the system, observed at any time, in the infinitesimal phase volume $dp\,dq$.

The important point to note here is that we have "only one" system under consideration (no statistical ensemble is considered at this point). The distribution $\rho_{actual}(p,q)$ so defined is a temporal distribution for that system.

With the introduction of the statistical distribution function one can calculate the mean value of any dynamical quantity $f(p,q)$ as

$$\bar{f} = \int f(p,q)\,\rho_{actual}(p,q)\,dp\,dq. \qquad (7)$$



It is obvious from the definition (6) of probability that the statistical averaging (equation (7)) is exactly equivalent to the infinite time averaging

$$ \bar f = \lim_{T\to\infty}\frac{1}{T}\int_0^T f(t)\,dt. \tag{8} $$



Thus any ergodic hypothesis is avoided, as there is no ad-hoc introduction of any probability distribution. The $\rho_{\rm actual}(p,q)$ is the actual probability distribution in the system's phase space, obtained from its temporal behaviour.
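The equivalence of equations (7) and (8) can be illustrated numerically. The following is a minimal Python sketch and not part of Landau and Lifshitz's treatment: it assumes a single harmonic oscillator with trajectory $q(t) = \cos t$ as a stand-in for the system, builds the temporal distribution of definition (6) as a histogram of the time spent in each cell (the helper `temporal_distribution` and all parameter values are hypothetical choices), and compares the average of $f = q^2$ over that distribution with the direct time average.

```python
import math


def temporal_distribution(samples, n_bins, lo, hi):
    """Fraction of (equally spaced) observation times spent in each cell of [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for q in samples:
        k = min(int((q - lo) / width), n_bins - 1)  # clamp the right edge
        counts[k] += 1
    total = len(samples)
    return [c / total for c in counts], width


def f(q):
    """Dynamical quantity to be averaged (an arbitrary illustrative choice)."""
    return q * q


# Assumed toy trajectory: q(t) = cos(t), sampled at equal time steps.
n_steps, dt = 200_000, 0.01
trajectory = [math.cos(k * dt) for k in range(n_steps)]

# Equation (8): direct time average along the single trajectory.
time_avg = sum(f(q) for q in trajectory) / n_steps

# Equations (6)-(7): average over the temporal distribution w.
n_bins = 200
w, width = temporal_distribution(trajectory, n_bins, -1.0, 1.0)
centers = [-1.0 + (k + 0.5) * width for k in range(n_bins)]
dist_avg = sum(f(c) * wk for c, wk in zip(centers, w))

print(time_avg, dist_avg)
```

The two numbers agree up to the binning error, which reflects the fact that no probability distribution beyond the time spent by the one trajectory has been introduced.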

But it is non-trivial to prove that

$$ \lim_{T\to\infty}\frac{1}{T}\int_0^T f(t)\,dt = \int f(p,q)\,\rho_{\rm microcanonical}(p,q)\,dp\,dq. $$

This has never been proved in generality (it turns out that this is a very difficult problem for a system with a large number of degrees of freedom). We will see from Landau's argument and with the assumption of "statistical independence" that $\rho_{\rm actual} = \rho_{\rm microcanonical}$ when the number of degrees of freedom involved becomes very large.

Then they argue [8] that the predictions of statistical mechanics are very reliable due to the fact that the relevant observables take an almost constant value on the energy surface. They also consider "sum" functions $f = \sum_i f_i$. Their considerations are based on a very general fact of "statistical independence" of the various parts (pertaining to subsystems) of the macroscopic observables (it is not true for long range interacting systems). On the basis of "statistical independence" they prove:

$$ \frac{\sqrt{\langle(\Delta f)^2\rangle}}{\bar f} \propto \frac{1}{\sqrt{N}}. $$
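The $1/\sqrt{N}$ scaling of the relative fluctuation of a sum function can be checked with a short Monte Carlo sketch. It is only an illustration under a strong assumption: the parts $f_i$ are modelled as independent random numbers uniform on $[0,1)$, a toy stand-in for statistically independent subsystems; the function name and the sample sizes are hypothetical choices.

```python
import random
import statistics


def relative_fluctuation(n_parts, n_samples, rng):
    """Relative fluctuation std(f)/mean(f) of a sum function f = sum_i f_i
    built from n_parts statistically independent contributions."""
    totals = [sum(rng.random() for _ in range(n_parts)) for _ in range(n_samples)]
    return statistics.pstdev(totals) / statistics.mean(totals)


rng = random.Random(0)  # fixed seed so the run is reproducible
small = relative_fluctuation(100, 400, rng)    # N = 100 parts
large = relative_fluctuation(2500, 400, rng)   # N = 2500 parts

# If the relative fluctuation scales as 1/sqrt(N), this ratio is close to
# sqrt(2500 / 100) = 5, up to sampling noise.
ratio = small / large
print(small, large, ratio)
```

Increasing the number of parts by a factor of 25 reduces the relative fluctuation by roughly a factor of 5, which is the content of the relation above.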



One can say that the statistical independence of Landau-Lifshitz plays the same role as the molecular chaos of Khinchin.

1. Landau's argument: shortest derivation of the canonical 
ensemble 

1. The distribution function of two sub-systems is equal to the product of the individual sub-systems' functions (statistical independence), i.e., $\rho_{12} = \rho_1\rho_2$ (for simplicity of notation here $\rho_{\rm actual} = \rho$).

2. But $\log(\rho_{12}) = \log(\rho_1) + \log(\rho_2)$. Thus the log of the distribution function is additive, and by step 3 it is an additive integral-of-motion.

3. Due to Liouville's theorem (Hamiltonian dynamics) the distribution function is an integral-of-motion (temporal invariance of the phase space distribution of an ensemble).

4. We have only seven independent additive integrals of motion $E(p,q)$, $\mathbf{P}(p,q)$, and $\mathbf{M}(p,q)$ (from mechanics).

5. Hence $\log(\rho) = \alpha + \beta E(p,q) + \boldsymbol{\gamma}\cdot\mathbf{P}(p,q) + \boldsymbol{\delta}\cdot\mathbf{M}(p,q)$.

6. From which the canonical ensemble ensues, $\rho = e^{\alpha + \beta E(p,q)}$, say, for a system in a box.

The important point to be noted is that there are $6N - 7$ other integrals of motion (excluding the additive ones, i.e., energy, three components of linear and three components of angular momentum). These remaining integrals of motion are not additive, thus they do not play any role in Landau's argument.

Thus we see that in Lev Landau's program for the foundations of statistical mechanics, if one starts from the beginning with a system present in a bigger system, assumes the property of statistical independence, and takes note of the additive integrals of motion, then one can reach the canonical distribution, and no ergodic hypothesis is required as we have not constructed any ensemble.
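The interplay of statistical independence and additivity in Landau's argument can be made concrete with a small numerical check. The sketch below is an illustration only, under assumptions that are not in the source: two identical subsystems with four discrete energy levels, a value of $\beta$ chosen arbitrarily, and a power-law weight used as a counter-example. It tests whether the product $\rho_1\rho_2$ depends only on the additive quantity $E_1 + E_2$.

```python
import math


def normalized(weights):
    """Turn positive weights into probabilities."""
    z = sum(weights)
    return [w / z for w in weights]


def spread_at_fixed_total(levels, probs):
    """Largest spread of the joint probability rho_1 * rho_2 among pairs of
    subsystem states that share the same total energy E_1 + E_2."""
    by_total = {}
    for e1, p1 in zip(levels, probs):
        for e2, p2 in zip(levels, probs):
            by_total.setdefault(e1 + e2, []).append(p1 * p2)
    return max(max(v) - min(v) for v in by_total.values())


levels = [0.0, 1.0, 2.0, 3.0]   # assumed toy spectrum of each subsystem
beta = -0.7                     # convention rho = exp(alpha + beta*E); value is arbitrary

# Exponential (canonical) weight: log(rho) is linear in the additive energy.
canonical = normalized([math.exp(beta * e) for e in levels])
# Counter-example: a power law in E, for which log(rho) is not linear in E.
power_law = normalized([1.0 / (1.0 + e) ** 2 for e in levels])

spread_canonical = spread_at_fixed_total(levels, canonical)
spread_power_law = spread_at_fixed_total(levels, power_law)
print(spread_canonical, spread_power_law)
```

For the exponential weight the joint probability is the same for all pairs with equal total energy (spread at rounding level), whereas for the power law it is not, so only the exponential form is compatible with a product distribution that is a function of the additive energy alone.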



2. Why is the hypothesis of equal-a-priori probabilities plausible?

We observe that the property of statistical independence is equivalent to the hypothesis of equal-a-priori probabilities. We give two reasons:

Reason (1): There are two views at the foundations of statistical mechanics: (1) is based on the principle of equal-a-priori probabilities (Kubo's book, Tolman's etc.) and (2) is based on quasi-closed subsystems and the additive integrals of motion (due to L. D. Landau, see the book by Landau and Lifshitz). If one analyzes them carefully one sees that both view-points are consistent with each other; the first one comes from the second.

The principle of equal-a-priori probability is just a consequence of the fact that there are mechanical invariants of motion proportional to the log of the distribution function for a quasi-closed subsystem ($E \propto \log\rho$) (due to statistical independence), which is another integral of motion due to Liouville's theorem; thus one has $\log(\rho) = \alpha + \beta E_{\rm subsystem}$. The right hand side of this equation is the content of dynamics (conservation laws) [45]. The left hand side consists of a more subtle quantity related to statistical properties, and is a consequence of the theorem of dynamics and the property of statistical independence. $E(p,q)$ is constant no matter where the phase point is in the available phase space. Consequently the log of $\rho$, and thus $\rho$ itself, is the same for all possible $(p,q)$. This is the statement of equal-a-priori probabilities (see figure 4). As Tolman put it, a microstate has no preference to be in this or that part of phase space, but with the help of statistical independence it is more clearly seen.

[Figure: Statistical independence + Mechanical invariants → Hypothesis of equal-a-priori probabilities]
FIG. 4: The origin of equal-a-priori probabilities

Reason (2): [49] If one assumes "statistical independence" then $\rho_{\rm total} = \rho_{{\rm subsystem}_1}\,\rho_{{\rm subsystem}_2}\,\rho_{{\rm subsystem}_3}\cdots$. Let $f_{\rm total} = \sum_i f_i$, where $f_i = \ln\rho_{{\rm subsystem}_i}$. Now it is easy to prove that $\sqrt{\langle(\Delta f_{\rm total})^2\rangle}/\bar f_{\rm total} \propto 1/\sqrt{N}$. Thus $f_{\rm total}$ is almost constant on the energy surface $E_{\rm total} = E_{{\rm subsystem}_1} + E_{{\rm subsystem}_2} + \cdots$ for $N \to \infty$. This again forces us to say that $\rho_{\rm total}$ takes an almost constant value on the energy surface, which is a statement of equal-a-priori probabilities. Thus we see that the hypothesis of statistical independence directly gives us equal-a-priori probabilities.

The important point here to be noted is that the above 
reasons are only qualitative observations based on statis- 
tical independence and these are not "rigorous" mathe- 
matical theorems. 



3. Implications of Landau's argument: equilibrium statistical mechanics without any ad-hoc hypothesis and without the construction of ensembles

Most fundamental ingredient: the property of "statistical independence".

If one assumes that the various (macroscopic) parts of the body are statistically independent from each other, then one can construct the canonical and microcanonical formulations without any ad-hoc and extra hypothesis (based only on statistical independence and some additive mechanical invariants of motion). Consider a macroscopic system in equilibrium: all its (macroscopic) parts or subsystems are in equilibrium with each other. Let $E_{\rm subsystem}$ be the energy of the subsystem. It remains constant when the whole system is in equilibrium, and the subsystem, although small as compared to the whole system, is still macroscopic, so the relative fluctuations in $E_{\rm subsystem}$ are very small ($\propto 1/\sqrt{N}$).

One can argue that after sufficiently long time intervals the subsystem cannot be considered quasi-closed, as the effect of the interaction of subsystems, however weak, will ultimately appear, and this weak interaction ultimately leads to the establishment of statistical equilibrium. We will see below that in equilibrium this objection does not apply, but in non-equilibrium it applies and we do not have any "non-equilibrium canonical" formulation.

Let us consider the subsystem for an amount of time $\Delta t$ (much less than the relaxation time of the subsystem under consideration). The dynamics is Hamiltonian and due to Liouville's theorem $\log\rho_1$ is a constant of motion. With Landau's argument $\log\rho_1 \propto E_{\rm subsystem}$ or $\log\rho_1 = \alpha_1 + \beta_1 E_{\rm subsystem}$. Now let us wait for a time interval sufficiently long as compared to the relaxation time. After this, the subsystem is no longer quasi-closed, and say the distribution is $\rho_2$ at the end of this long time interval. Again consider an interval of time $\Delta t'$ much less than the relaxation time; during this interval the system is again closed and Landau's argument again applies: $\log\rho_2 \propto E_{\rm subsystem}$ or $\log\rho_2 = \alpha_2 + \beta_2 E_{\rm subsystem}$. From the above two equations we have $\alpha_1 - \alpha_2 + (\beta_1 - \beta_2)E_{\rm subsystem} = \log\frac{\rho_1}{\rho_2}$. Now thermodynamic considerations show that $\beta$ is to be identified with the inverse temperature of any subsystem of the whole system. As the whole system is in equilibrium, all the subsystems of the whole system are in equilibrium with each other, thus $\beta_1 = \beta_2$. Also the normalization conditions demand that

$$ e^{\alpha_1} = e^{\alpha_2} = \frac{1}{\int e^{\beta E_{\rm subsystem}}\,dp\,dq}. $$

This gives us $\rho_1 = \rho_2$ and the distribution remains stationary. Thus in equilibrium Landau's argument applies at all times. Moreover $\rho = e^{\alpha + \beta E_{\rm subsystem}}$ is constant, as the RHS of this equation is constant both in time and in phase (we neglect the fluctuations in $E_{\rm subsystem}$). Thus $\rho = {\rm constant}$, which is the microcanonical ensemble.

On similar lines, with Landau's argument we can proceed to show the canonical distribution $\rho = e^{\alpha + \beta E_{\rm subsystem}}$. Here we allow fluctuations in $E_{\rm subsystem}$ but keep $\beta$ the same for all subsystems.

The most important point to be noted here is that we do not "mentally construct" an ensemble and we do not assume the equal-a-priori probability hypothesis. The whole statistical formulation comes out from the property of "statistical independence" and the few additive integrals of motion (note that statistical independence does not apply to long range interacting systems). All other integrals of motion do not play any role in Landau's argument because they are not additive.

4. Limitations of Landau-Lifshitz's approach

(1) Temporal definition of probability: In a sufficiently long interval of time $T$ the phase point spends an amount of time $\Delta t$ in a given volume $\Delta p\,\Delta q$ of the phase space, thus $w = \lim_{T\to\infty}\frac{\Delta t}{T}$ is the probability that, if the system is observed at any arbitrary time, its phase point will be found in the volume $\Delta p\,\Delta q$.

With the above definition the non-stationary case cannot be treated, as the probability to find the system in $\Delta p\,\Delta q$ itself changes with time. Thus the above definition excludes the ubiquitous non-equilibrium cases.

(2) As Landau-Lifshitz's whole analysis is based upon quasi-closed subsystems, the relation $\log(\rho) = \alpha + \beta E(p,q) + \boldsymbol{\gamma}\cdot\mathbf{P}(p,q) + \boldsymbol{\delta}\cdot\mathbf{M}(p,q)$ holds good only for "not too long intervals of time", as after sufficiently long time intervals the subsystem cannot be considered quasi-closed. In their own words: "Over a sufficiently long interval of time (compared to the relaxation time), the effect of interaction of subsystems, however weak, will ultimately appear. Moreover, it is just this relatively weak interaction which leads finally to the establishment of statistical equilibrium". We have seen that in equilibrium Landau's argument applies at all times, but when the system has not settled to equilibrium (temperature equilibrium) we cannot apply Landau's argument, for the above reason.



B. E. T. Jaynes' approach 

There are two schools at the foundations of probabil- 
ity theory (1) frequentists (probability from many ran- 
dom experiments), and (2) Bernoulli-Bayes-Laplacians or 
simply Bayesians (Principle of indifference). 

Jaynes' standpoint: The ergodic problem is a conceptual problem related to the very concept of ensemble itself, which is a byproduct of the frequency theory of probability. The ergodic problem becomes devoid of any meaning when the probabilities of various micro-states are interpreted with the Bernoulli-Bayes-Laplace theory of probability. In the Bayesian theory of probability, probabilities of occurrence of events are independent of the frequency concept. It is a more general viewpoint in which the frequency theory is a special case, and it is based upon the principle of indifference. Thus if we do not visualize the probability of occurrence of a micro-state with frequency (or construct an ensemble) there is no question of ensemble averaging and the ergodic problem is completely bypassed.

Then the statistical mechanical theory is attacked with statistical inference (Shannon entropy approach). In
Jaynes' words pT]: 

"We can have our justification for the rules of statistical 
mechanics, in a way that is incomparably simpler than anyone had thought possible, if we are willing to pay the
price. The price is simply that we must loosen the con- 
nections between probability and frequency, by returning 
to the original viewpoint of Bernoulli and Laplace. The 
only new feature is that their principle of Insufficient rea- 
son is now generalized to the principle of Maximum En- 
tropy." 

For more details see his detailed account pi]. 
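The maximum entropy prescription can be sketched for a toy case. The example below is only an illustration and makes several assumptions that are not in the source: four discrete energy levels, a prescribed mean energy as the only macroscopic information, the standard convention $p_i \propto e^{-\beta E_i}$ (opposite in sign to the $e^{\alpha + \beta E}$ convention of the Landau section), and a bisection solver for $\beta$ (all function names are hypothetical). It finds the Boltzmann form that satisfies the constraint and compares its Shannon entropy with that of another distribution having the same mean energy.

```python
import math


def boltzmann(levels, beta):
    """Maximum-entropy distribution for a mean-energy constraint: p_i ~ exp(-beta*E_i)."""
    weights = [math.exp(-beta * e) for e in levels]
    z = sum(weights)
    return [w / z for w in weights]


def mean_energy(levels, probs):
    return sum(p * e for p, e in zip(probs, levels))


def solve_beta(levels, target, lo=-50.0, hi=50.0):
    """Bisection on beta; the mean energy decreases monotonically with beta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(levels, boltzmann(levels, mid)) > target:
            lo = mid   # mean too high: beta must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


def shannon_entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)


levels = [0.0, 1.0, 2.0, 3.0]   # assumed toy spectrum
target = 1.0                    # assumed macroscopic information: mean energy

beta = solve_beta(levels, target)
p_maxent = boltzmann(levels, beta)

# Another distribution with the same mean energy, for comparison.
p_other = [0.5, 0.0, 0.5, 0.0]

print(beta, p_maxent)
print(shannon_entropy(p_maxent), shannon_entropy(p_other))
```

Among distributions compatible with the given mean energy, the exponential one has the larger entropy; no ensemble or frequency interpretation enters the calculation, which is the point Jaynes emphasizes.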



1. Critique of Jaynes' approach

The present author feels that it does not matter (at least in the computational problems of statistical mechanics) which theory of probability one accepts. But there is a disadvantage if we attack the statistical mechanical problem with the Bayesian viewpoint.

If one accepts the Bayesian viewpoint, then one abandons the concept of ensembles. The probabilities of the microstates are obtained by maximizing Shannon's entropy (with the given amount of macroscopic information). But this also means that one is neglecting the dynamical properties of the constituents of matter (although they are not so important for sum-functions, à la Khinchin). Obviously the molecules of a gas obey Newton's laws or Schroedinger's equation. The phase point of an isolated system has no preference for this region or that region of phase space (the phase point of an isolated system visits all the accessible regions of phase space with equal frequency, although the times involved are fantastically large).

It is also a fact that in macroscopic observables (which are so coarse on the microscopic scale) the microscopic dynamical properties are not reflected in all their detail; only very general microscopic dynamical properties are reflected. For example, additive mechanical invariants play the essential role (in Landau's approach). Thus neglecting the dynamical properties of the constituents altogether is not a valid starting point. It also seems difficult to describe the integrable to non-integrable transition in dynamical systems with Jaynes' approach.

As we have seen, the sum-function macroscopic observables depend only weakly on the dynamics of the phase point, and it is this (with Khinchin), not the Bayesian viewpoint, that is the resolution which makes the ergodic problem devoid of any meaning. Historically it is to be noted that Prof. G. Uhlenbeck objected to Jaynes' ideas [21].



IV. QUANTUM FOUNDATIONS OF 
STATISTICAL MECHANICS 

Since in the quantum case position and momentum do not commute with each other, the classical concept of phase space, which helps us to envisage the dynamics of the phase point on the energy surface, breaks down. However an equation analogous to equation (1) can be written in the quantum case also. The equivalent of the "phase space energy shell" in the quantum case is the sub-Hilbert space $\mathcal{H}_\nu$ of the total Hilbert space $\mathcal{H}$. $\mathcal{H}_\nu$ is defined as the space spanned by all $\phi_\nu$ (eigenstates of the system's Hamiltonian) such that their corresponding eigenvalues $E_\nu$ are in a specified interval of energy from $E$ to $E + \Delta E$. The quantum equivalent of the classical ergodic hypothesis (equation (1)) is written as



$$ \lim_{T\to\infty}\frac{1}{T}\int_0^T \langle A(t)\rangle\,dt = \frac{1}{N_{E,\Delta E}}\sum_{\alpha:\,E_\alpha\in[E,E+\Delta E]}\langle\phi_\alpha|A|\phi_\alpha\rangle, \tag{10} $$



for a system in a pure quantum state (for details see equation (11) below). For a system in a mixed quantum state replace $|c_\alpha|^2 \to \rho_{\alpha\alpha}$, with $\rho_{\alpha\alpha}$ the diagonal element of the density matrix in the energy representation.

Here again we encounter (LHS of the above equation) the problem of infinite time averaging, and all the arguments of section I(A) are valid, but we will see below that "resolving results" analogous to Khinchin's self averaging theorem hold in the quantum case also, either based upon quantum chaos or based on von Neumann's "scale separation" ideas. We again see the fundamental role of typicality and large dimensional Hilbert spaces.



A. Eigenstate Thermalization Hypothesis (ETH) 

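Eigenstate Thermalization Hypothesis (ETH) implies that thermalization happens at the level of individual eigenstates. Consider a system with Hamiltonian $H$ and eigensystem $\{E_\alpha, \phi_\alpha\}$ prepared in some initial state. If the system has unitary evolution, then at any later time $t$ (with $\hbar = 1$)

$$ |\psi(t)\rangle = \sum_\alpha c_\alpha\,e^{-iE_\alpha t}\,|\phi_\alpha\rangle. $$

The quantum mechanical mean of an operator $A$ pertaining to the system is:

$$ \langle A(t)\rangle = \langle\psi(t)|A|\psi(t)\rangle = \sum_{\alpha,\beta} c_\alpha^{*}c_\beta\,e^{i(E_\alpha - E_\beta)t}\,A_{\alpha\beta}. $$

Consider the infinite time average

$$ \overline{\langle A(t)\rangle} = \sum_{\alpha,\beta} c_\alpha^{*}c_\beta\,\delta_{\alpha,\beta}\,A_{\alpha\beta} = \sum_\alpha |c_\alpha|^2 A_{\alpha\alpha}, \qquad \delta_{\alpha,\beta} = \lim_{T\to\infty}\frac{1}{T}\int_0^T e^{i(E_\alpha - E_\beta)t}\,dt. \tag{11} $$

$\sum_\alpha |c_\alpha|^2 A_{\alpha\alpha}$ is also called the diagonal ensemble in [22]. Now the thermodynamic universality demands: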



[Figure: ETH. General theory is seriously lacking. Proved in special circumstances: (a) integrable Hamiltonian + weak random matrix (GOE); (b) systems which are classically chaotic, where ETH follows in the semiclassical limit [A. Voros, 1979] and ETH follows from Berry's conjecture (low density billiards, Srednicki, 1994). Berry's conjecture: eigen wavefunctions are random Gaussian variables for a classically chaotic quantum system.]

FIG. 5: Status of ETH



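LONG TIME AVERAGE = AVERAGE OVER APPROPRIATE STATISTICAL ENSEMBLE (MICROCANONICAL OR CANONICAL ETC.)

$$ \sum_\alpha |c_\alpha|^2 A_{\alpha\alpha} = \langle A\rangle_{{\rm microcan\ at}\ E=E_0} = \frac{1}{N_{E_0,\Delta E}}\sum_{\alpha;\,|E_0 - E_\alpha|<\Delta E} A_{\alpha\alpha}. \tag{12} $$

This is a universality relation whose LHS depends on the initial conditions but whose RHS does not!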
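The diagonal-ensemble result of equation (11) can be checked numerically on a small example. The sketch below is an illustration under assumptions that are not in the source: a three-level system with a non-degenerate spectrum, arbitrarily chosen amplitudes $c_\alpha$ and a real symmetric observable $A$ written in the energy eigenbasis, $\hbar = 1$, and a finite averaging time in place of the limit $T \to \infty$.

```python
import cmath

# Assumed toy data, written directly in the energy eigenbasis (hbar = 1).
energies = [0.0, 1.0, 2.5]                      # non-degenerate spectrum
c = [0.6 + 0j, 0.64 + 0j, 0.48 + 0j]            # amplitudes; |c|^2 sums to 1
A = [[1.0, 0.3, 0.2],
     [0.3, -0.5, 0.4],
     [0.2, 0.4, 2.0]]                           # Hermitian observable A_{alpha beta}


def expectation(t):
    """<A(t)> = sum_{alpha,beta} c_alpha^* c_beta exp(i (E_alpha - E_beta) t) A_{alpha beta}."""
    total = 0j
    for a in range(3):
        for b in range(3):
            phase = cmath.exp(1j * (energies[a] - energies[b]) * t)
            total += c[a].conjugate() * c[b] * phase * A[a][b]
    return total.real


# Finite-time stand-in for the infinite time average (T = n_steps * dt = 1000).
n_steps, dt = 20_000, 0.05
time_avg = sum(expectation(k * dt) for k in range(n_steps)) / n_steps

# Diagonal ensemble: sum_alpha |c_alpha|^2 A_{alpha alpha}.
diagonal = sum(abs(c[a]) ** 2 * A[a][a] for a in range(3))

print(time_avg, diagonal)
```

The off-diagonal terms average out at a rate of order $1/(|E_\alpha - E_\beta|\,T)$, so the long time average approaches the diagonal ensemble; whether the diagonal ensemble in turn equals the microcanonical average is the separate question addressed by equation (12).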

Now the Eigenstate Thermalization Hypothesis (ETH) says that there are no eigenstate-to-eigenstate fluctuations in the $A_{\alpha\alpha}$'s for eigenstates close in energy

$$ \Longrightarrow\quad \sum_\alpha |c_\alpha|^2 A_{\alpha\alpha} \approx \frac{1}{N_{E,\Delta E}}\sum_{\alpha\in{\rm Window}} A_{\alpha\alpha}. $$

This is called the eigenstate thermalization hypothesis (ETH) by Deutsch (1991) and Srednicki (1994) [2S1 |5B]:

"The quantum expectation value $\langle\psi_\alpha|A|\psi_\alpha\rangle$ in a large interacting many-body system is equal to the thermal average"

$$ \langle\psi_\alpha|A|\psi_\alpha\rangle = \langle A\rangle_{{\rm microcan.\ around}\ E_\alpha}. \tag{13} $$



If this is valid in generality, then one can explain thermodynamic universality. Also, if the system obeys Berry's conjecture (eigenfunctions are random Gaussian variables for a classically chaotic quantum system) the off-diagonal elements ($\langle\psi_\alpha|A|\psi_\beta\rangle$, $\alpha \neq \beta$) are negligible and then one obtains thermal behaviour without any time averaging at all [26]. Also it has been numerically shown [22] that the magnitudes of the off-diagonal elements of the momentum distribution operator are very small as compared to the diagonal elements, $A_{\alpha\beta}|_{\alpha\neq\beta} \ll A_{\alpha\alpha}$. Clearly much work is needed to show ETH for other operators and importantly for sum-function operators (across the
integrability-to-(non)integrability transition). For de- 
tailed discussion see (55] and the ref[231 [21] for the be- 
haviour of diagonal and off-diagonal elements of various 
other observables. 

A study of the literature shows that ETH has been proved analytically only in special circumstances (see figure 5). The figure shows that some kind of chaos is necessary for ETH to hold. This is also evident in the numerical experiment of Rigol et al. [22] for a small system (five hard core bosons on a lattice), in which ETH holds good in the non-integrable system but fails in the integrable one (for the marginal momentum distribution as the observable).

Clearly systems with a very small number of degrees-of-freedom cannot be fit into Khinchin's program (which requires macroscopic systems). This implies that the statistical mechanical universality (irrelevance of chaos) that we enjoy with macroscopic systems and sum function observables may no longer hold good for small isolated quantum systems.



B. von Neumann's quantum ergodic theorem (or the better notion of "normality" due to Goldstein-Lebowitz-Tumulka-Zanghi (GLTZ))

We present here the qualitative statement of von Neumann's quantum ergodic theorem; for a precise definition see [37]. Our aim here is to understand the physical basis of the theorem, to draw some analogies with the approach by Khinchin, and to see the connection with the ETH (eigenstate thermalization hypothesis).

Setup: Consider a system with Hamiltonian $H$ and total Hilbert space $\mathcal{H}$. Consider that the total Hilbert space is partitioned into mutually orthogonal sub-Hilbert-spaces $\mathcal{H}_\nu$, with family $\mathcal{D}$ ($\mathcal{D} = \{\mathcal{H}_\nu\}$, $\mathcal{H} = \bigoplus_\nu \mathcal{H}_\nu$). Let $d_\nu = \dim\mathcal{H}_\nu$ and $D = \dim\mathcal{H}$. Let the total number of partitions be $n$, i.e., $n = \#\mathcal{D}$.

The physical basis of this partitioning (according to von Neumann) is that each $\mathcal{H}_\nu$ belongs to a different macro-state (or macro-observer). This is the crucial insight of von Neumann: as in quantum theory not all operators corresponding to observables commute among themselves, von Neumann takes the physical standpoint that the measurement process is coarse on microscopic variables, thus operators can be taken as "approximately" commuting (measurement errors are large as compared to the bounds of the uncertainty relations). This is called "rounding" of operators.

The aim of the Quantum Ergodic Theorem (QET) is to establish some kind of universality or "normality": the quantum expectation values of operators are close to the microcanonical averages for most of the time (under some conditions, see below). It is a kind of quantum H-theorem which says that for all initial states the time evolution takes the system from the initial non-equilibrium state to the equilibrium state (provided the system satisfies some conditions).

Thus the total Hilbert space is partitioned into subspaces $\mathcal{H}_\nu$ and the macroscopic observables take specific values in each $\mathcal{H}_\nu$ (the values taken in $\mathcal{H}_\nu$ are different from those taken in $\mathcal{H}_\mu$ when $\nu \neq \mu$). The "rounded" operators commute with each other but they do not commute with the Hamiltonian (this is analogous to non-integrals of motion in the classical Gibbs picture).

Qualitative statement of QET:

Let $P_\nu$ be the projection to $\mathcal{H}_\nu$, with macro-state $\nu$ having some values of the observables.

Then QET says:

For all initial wavefunctions $\psi_0 \in \mathcal{H}$, $\|\psi_0\| = 1$ (note that this includes the special initial non-equilibrium states), we have for most of the time

$$ \|P_\nu\psi_t\|^2 = \langle\psi_t|P_\nu|\psi_t\rangle \approx {\rm tr}(\rho_{mc}P_\nu) = \frac{d_\nu}{D}. \tag{14} $$

In words: the probability distribution over macro-states is approximately equal to the ratio of the dimension of the Hilbert space corresponding to that macro-state to the dimension of the total Hilbert space. Thus the macro-state with the largest $d_\nu$ is the most probable (which is in accord with common sense: the equilibrium state has the largest $d_\nu = d_{eq}$ and it is typically observed).
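The ratio $d_\nu/D$ in equation (14) can be made plausible with a simple numerical sketch. It is only an illustration and differs from the theorem in an important way, stated here as an assumption: instead of following the time evolution $\psi_t$ under a Hamiltonian, it draws one random normalized state with independent complex Gaussian amplitudes, and it takes the subspaces $\mathcal{H}_\nu$ to be blocks of basis vectors with hypothetical dimensions. It then compares the weights $\|P_\nu\psi\|^2$ with $d_\nu/D$.

```python
import random


def random_state(dim, rng):
    """Normalized state with independent complex Gaussian amplitudes."""
    amps = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = sum(abs(a) ** 2 for a in amps) ** 0.5
    return [a / norm for a in amps]


def macrostate_weights(state, dims):
    """||P_nu psi||^2 for subspaces taken as consecutive blocks of basis vectors."""
    weights, start = [], 0
    for d in dims:
        weights.append(sum(abs(a) ** 2 for a in state[start:start + d]))
        start += d
    return weights


rng = random.Random(1)           # fixed seed for reproducibility
dims = [18000, 1500, 500]        # assumed d_nu; the first plays the role of d_eq
D = sum(dims)

psi = random_state(D, rng)
weights = macrostate_weights(psi, dims)
ratios = [d / D for d in dims]

for w, r in zip(weights, ratios):
    print(round(w, 4), round(r, 4))
```

For large $D$ the weights of a typical state fall close to $d_\nu/D$, with the largest subspace carrying most of the weight; the dynamical statement of QET, that $\psi_t$ itself behaves this way for most times, requires the conditions listed next.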

Conditions involved in QET are:

1. The Hamiltonian is free of resonances, i.e., $E_\alpha - E_\beta \neq E_{\alpha'} - E_{\beta'}$ unless either $\alpha = \alpha'$, $\beta = \beta'$ or $\alpha = \beta$, $\alpha' = \beta'$.

2. The quantity $f_\nu(H,\mathcal{D}) = \max_{\alpha\neq\beta}|\langle\phi_\alpha|P_\nu|\phi_\beta\rangle|^2 + \max_\alpha\left(\langle\phi_\alpha|P_\nu|\phi_\alpha\rangle - \frac{d_\nu}{D}\right)^2$ is small for all $\nu$. The meaning of small here is that the quantity $f_\nu(H,\mathcal{D})$ lies below a bound fixed by $\epsilon$, $\delta'$ and $n$. Here $\epsilon$ and $\delta'$ are small positive numbers and $n$ is the number of partitions, $n = \#\mathcal{D}$. For more details see [27] and the English translation of von Neumann's original article [25].
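Condition 1 can be tested directly for a given finite spectrum. The following sketch is an illustration only; the function name and the numerical tolerance are hypothetical choices. It checks that all energy gaps $E_\alpha - E_\beta$ with $\alpha \neq \beta$ are nonzero and mutually distinct, which is what "free of resonances" requires.

```python
def is_free_of_resonances(energies, tol=1e-9):
    """True if E_a - E_b = E_a' - E_b' only when (a = a', b = b') or (a = b, a' = b').

    Equivalently: every gap between distinct levels is nonzero (no degeneracy)
    and no two ordered pairs of distinct levels share the same gap.
    """
    n = len(energies)
    gaps = sorted(energies[a] - energies[b]
                  for a in range(n) for b in range(n) if a != b)
    if any(abs(g) <= tol for g in gaps):
        return False          # degenerate levels give a vanishing gap
    return all(g2 - g1 > tol for g1, g2 in zip(gaps, gaps[1:]))


# Equally spaced levels (harmonic-oscillator-like): the gap 1 occurs many times.
print(is_free_of_resonances([0.0, 1.0, 2.0, 3.0]))
# A generic-looking spectrum with all gaps distinct.
print(is_free_of_resonances([0.0, 1.0, 2.5, 4.7]))
```

An equally spaced spectrum violates the condition because the same gap is shared by many pairs of levels, while a spectrum with incommensurate spacings satisfies it.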

This is QET. This property should better be called "normality" after GLTZ. GLTZ also give a stronger bound on the deviations from the average [27]. The notion of "normality" is a special case of a broader notion of "typicality". There are various kinds of normality properties, as discussed in the next subsection. Strict determinism of classical mechanics is replaced by the notion
of "typicality" of statistical mechanical systems. The 
notion of typicality captures the almost deterministic be- 
haviour of statistical mechanical systems (although atyp- 
ical behaviour is possible in principle but highly improb- 
able) nag. 
1. Remarks on ETH and QET

1. QET is a typicality result (typicality due to "scale separation" (see DNT below)) valid for large systems (i.e., containing a large number of particles), but ETH (again a typicality result, with typicality due to quantum chaos) has been applied even to a few boson atoms in an optical lattice.

2. ETH involves quantum chaos and QET does not involve quantum chaos!

3. ETH has been proved (analytically) only in special circumstances based on the assumption of quantum chaos [23], see figure (5). A general theory is seriously lacking.



C. Other approaches 

Recent literature can be organized under the following 
four notions (see figure 6): 

1. Kinematical canonical typicality (KCT) 

Qualitative definition: We know that a small system 
weakly coupled to a large bath is described by canonical 
ensemble when the composite system (system + bath) 
is described by microcanonical ensemble. Kinematical 
canonical typicality says that even if the state of a com- 
posite system is pure (rather than a mixture as in the 
traditional case) the reduced density matrix of the sys- 
tem is canonical. 

KCT has been announced in[30] extending the works 
of Schrodinger[3"T]. Related works appeared in[321 155] . 
In [30] the basic assumption is that the probability distribution is uniform over all normalized wavefunctions $\Psi$ with energy in the shell $[E, E+\delta]$ of the composite system. The system density matrix (when the composite is in a pure state) $\rho^\Psi = {\rm tr}_B|\Psi\rangle\langle\Psi|$ is obtained by tracing out the bath degrees of freedom. The crucial role in the tracing is played by the expansion coefficients $c_{ij}$ in $\Psi \propto \sum_{i,j} c_{ij}\,|e_i^S\rangle|e_j^B\rangle$ (normalized to unity), which were assumed to be Gaussian random variables. It is shown that $\rho^\Psi$ is approximately canonical (for details see [50]). A rather general treatment of KCT is given in [36]. They prove KCT using Levy's Lemma, which is analogous to the law of large numbers. They prove that $\langle D(\rho^\Psi,\Omega_S)\rangle \le \frac{1}{2}\sqrt{d_S/d_E^{\rm eff}}$, where $D(\rho^\Psi,\Omega_S) = \frac{1}{2}{\rm tr}\sqrt{(\rho^\Psi - \Omega_S)^\dagger(\rho^\Psi - \Omega_S)}$ is the trace distance between $\rho^\Psi$ (the reduced density matrix of the system when the composite is in the pure state) and $\Omega_S$ (the traditional reduced density matrix of the system, i.e., when the composite is in the mixed state). The average $\langle\ldots\rangle$ is over the environment states with the standard unitarily invariant measure, $d_S$ is the dimension of the system's Hilbert space, and $d_E^{\rm eff} = 1/{\rm tr}(\Omega_E^2)$ is a measure of the effective size of the environment, $d_E^{\rm eff} \ge d_R/d_S$ ($d_R$ is the dimension of the environment's Hilbert space). Thus when the bath or environment is much larger than the system, i.e., $d_R \gg d_S$, then $d_E^{\rm eff} \gg 1$, and if $d_S/d_E^{\rm eff}$ is very small then $\rho^\Psi$ and $\Omega_S$ are close to each other, $\langle D(\rho^\Psi,\Omega_S)\rangle \le \frac{1}{2}\sqrt{d_S/d_E^{\rm eff}}$, which is a statement of KCT (for details and related results see [31]).
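The scaling of the KCT bound can be illustrated for the simplest possible case. The sketch below rests on assumptions that are not in the source: the system is a single qubit ($d_S = 2$), the allowed subspace is the whole product space with no energy constraint, so that $\Omega_S$ is the maximally mixed state and $d_E^{\rm eff}$ equals the bath dimension, and random pure states are drawn with independent complex Gaussian amplitudes. For a qubit the trace distance to the maximally mixed state is half the length of the Bloch vector, which avoids any matrix diagonalization.

```python
import math
import random


def mean_trace_distance(d_bath, n_samples, rng):
    """Average trace distance between the reduced qubit state of a random pure
    state on (qubit x bath) and the maximally mixed qubit state."""
    total = 0.0
    for _ in range(n_samples):
        # Amplitudes psi[s][b] for system state s = 0, 1 and bath state b.
        psi0 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d_bath)]
        psi1 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d_bath)]
        n0 = sum(abs(x) ** 2 for x in psi0)
        n1 = sum(abs(x) ** 2 for x in psi1)
        norm2 = n0 + n1
        # Reduced density matrix elements after tracing out the bath.
        rho00, rho11 = n0 / norm2, n1 / norm2
        rho01 = sum(x * y.conjugate() for x, y in zip(psi0, psi1)) / norm2
        # Bloch vector length; D(rho, I/2) = |r| / 2 for a qubit.
        bloch = math.sqrt((rho00 - rho11) ** 2 + 4.0 * abs(rho01) ** 2)
        total += 0.5 * bloch
    return total / n_samples


rng = random.Random(2)                      # fixed seed for reproducibility
bath_dims = [10, 100, 400]
means = [mean_trace_distance(d, 200, rng) for d in bath_dims]
bounds = [0.5 * math.sqrt(2.0 / d) for d in bath_dims]   # (1/2) sqrt(d_S / d_E^eff)

for d, m, b in zip(bath_dims, means, bounds):
    print(d, round(m, 4), round(b, 4))
```

In this toy setting the average distance shrinks as the bath grows and stays below $\frac{1}{2}\sqrt{d_S/d_E^{\rm eff}}$, which is the qualitative content of KCT; the energy-shell restriction that turns the maximally mixed state into a canonical one is not modelled here.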

2. Dynamical canonical typicality (DCT) 

Qualitative Definition: When the composite system 
(system + Bath) is in a pure state, the expectation value 
of an operator pertaining to the system, after sufficiently 
large time, will be almost equal to the canonical expec- 
tation value. 

DCT has been proved in [33] and there were numerical experiments [37] as evidence for it. In [33] DCT has been proved with an assumption of weak coupling between system and bath with $\Delta\epsilon \gg \lambda \gg \Delta_B$ ($\Delta\epsilon$ is the minimum spacing between the energy levels of the system, $\lambda$ is the magnitude of the system-bath coupling, and $\Delta_B$ is the maximum spacing between the energy levels of the bath). Also a hypothesis of "equal weights for eigenstates" is used, which makes the energy expansion coefficients small. The statement of DCT goes like this. Suppose that the composite system is in the pure state $|\Psi(t)\rangle$ at any time $t$; the expectation value of an operator $A$ of the system is written as $\langle A\rangle_t = \langle\Psi(t)|(A\otimes 1_B)|\Psi(t)\rangle$. Then DCT says that, after a sufficiently long time, $\langle A\rangle_t \approx {\rm tr}_S[A\,e^{-\beta H_S}]/{\rm tr}_S[e^{-\beta H_S}]$; for the proof see [33]. DCT has been proved in a much more general setting in [34] and also in [35], using an assumption that the matrix elements of the interaction Hamiltonian have random phases and then constructing Markovian master equations. In [31], the authors extend their previous kinematical results (last subsection). They prove that $\langle D[\rho_S(t),\omega_S]\rangle_t \le \frac{1}{2}\sqrt{d_S/d^{\rm eff}(\omega_B)} \le \frac{1}{2}\sqrt{d_S^2/d^{\rm eff}(\omega)}$. Here $\rho_S(t) = {\rm tr}_B[\rho_{\rm total}(t)]$ is the system density matrix ($\rho_{\rm total}(t) = |\Psi(t)\rangle\langle\Psi(t)|$ is the total density matrix (system + bath)) and $\omega_S = \langle\rho_S(t)\rangle_t = \lim_{T\to\infty}\frac{1}{T}\int_0^T\rho_S(t)\,dt$. In words, the theorem says that the time average of the trace distance (see previous subsection) between the instantaneous system density matrix and its time averaged density matrix is small if $d_S \ll d^{\rm eff}(\omega_B)$. This means that the system spends almost all of its time near $\omega_S$. This is called equilibration (weakly fluctuating about a steady state; this steady state need not be an equilibrium state). The important point is that this result is completely general (except for the assumption that the Hamiltonian is free of resonances, condition no. (1) in von Neumann's theorem). It does not depend upon the nature of the system-bath coupling or the condition of the bath (whether it is in equilibrium or not). They also treat the problem of thermalization (i.e., initial state independence, as the steady state reached may depend on the initial conditions). For a detailed account and further results, see their very nice paper [34]. Our purpose here is to introduce various typicality theorems and to convey the excitement to the reader. The interested reader should go to the original literature.

[Figure: diagram of typicality notions in statistical mechanics, with the following entries]
- Khinchin's self averaging theorem.
- Canonical typicality: a subsystem with weak interaction with a bath is typically canonically distributed when the whole is microcanonically distributed (large deviations from the canonical state are rare).
- Microscopic reversibility and macroscopic irreversibility: macroscopic irreversibility is the typical behaviour of macroscopic systems, in contrast to systems with a small number of degrees of freedom.
- Canonical typicality (CT). Kinematical CT: if the state of a composite system is pure (rather than a mixture) the reduced density matrix of the subsystem is canonical. Dynamical CT: almost any subsystem in interaction with a large enough bath (S+B in a pure state) will reach an equilibrium state (after some relaxation time scale) and remain close to it for almost all times.
- Normal typicality (NT). Kinematical NT: for an isolated quantum system the expectation value of an operator in a pure state is close to its microcanonical averaging or general averaging. Dynamical NT: for a "typical" family of commuting macroscopic observables of an isolated system, every initial wavefunction from a microcanonical energy shell so evolves that for most times in the long run, the joint probability distribution of these observables obtained from the time evolved state is close to their micro-canonical distribution.
- Equipartition of energy in normal modes of the FPU chain above the KAM threshold.
- Quantum chaos: Wigner's Random Matrix Theory and its generalizations; Asher Peres' approach; ETH (Eigenstate Thermalization), the most important case of chaos induced typicality.

FIG. 6: Typicality of statistical mechanical systems

3. Kinematical Normal typicality (KNT) 

Qualitative Definition: For an isolated quantum sys- 
tem in an arbitrary pure state, the quantum expectation 
value of an operator in that state hardly deviates from the 
ensemble average. The meaning of "hardly" is clarified 
below. 

A clear account of KNT is given in [35]. The statement of the KNT theorem there goes as follows. Consider that the system is in some pure state $|\psi\rangle = \sum_n c_n |n\rangle$, with $|n\rangle$ as eigenstates. The quantity 

$\sigma_A^2 = \overline{\langle\psi|A|\psi\rangle^2} - \left(\overline{\langle\psi|A|\psi\rangle}\right)^2$ 

is small. Here the average is $\overline{\langle\psi|A|\psi\rangle} = \int \langle\psi|A|\psi\rangle\, p(c) \prod_n d\,\mathrm{Re}(c_n)\, d\,\mathrm{Im}(c_n)$, where $p(c)$ is the probability distribution over the expansion coefficients. The meaning of "small" in the above statement is that $\sigma_A^2 < \Delta_A^2\, (\max_n q_n)(\mathrm{tr}\,\rho^2)$; here the $q_n$ are positive numbers typically of order one, and $\Delta_A$ is the difference between the maximum and minimum eigenvalues of $A$. For $K$ operators one has 

$\mathrm{Probability}\left[\max_i \frac{|\langle\psi|A_i|\psi\rangle - \overline{\langle\psi|A_i|\psi\rangle}|}{\Delta_{A_i}} > \epsilon\right] < \frac{K\, (\max_n q_n)(\mathrm{tr}\,\rho^2)}{\epsilon^2},$ 

where $i$ runs from 1 to $K$, and $\Delta_{A_i}$ is the difference between the maximum and minimum eigenvalues of $A_i$. 
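
The step from the variance bound to the probability bound can be sketched with a standard Chebyshev estimate. The block below is our own filling-in rather than a quotation from [35]; it treats $X_i = \langle\psi|A_i|\psi\rangle$ as a random variable under $p(c)$ and assumes the same constants $q_n$ apply to every operator $A_i$. 

```latex
% Chebyshev's inequality for a single operator A, with X = <psi|A|psi>:
\begin{align}
\mathrm{Prob}\big[\,|X - \overline{X}| \ge \epsilon\,\Delta_A\,\big]
  \;\le\; \frac{\sigma_A^2}{\epsilon^2 \Delta_A^2}
  \;<\; \frac{(\max_n q_n)\,\mathrm{tr}\,\rho^2}{\epsilon^2}.
\end{align}
% Union bound over K operators A_1, ..., A_K:
\begin{align}
\mathrm{Prob}\Big[\max_{1\le i\le K} \frac{|X_i - \overline{X_i}|}{\Delta_{A_i}} \ge \epsilon\Big]
  \;\le\; \sum_{i=1}^{K} \mathrm{Prob}\big[\,|X_i - \overline{X_i}| \ge \epsilon\,\Delta_{A_i}\big]
  \;<\; \frac{K\,(\max_n q_n)\,\mathrm{tr}\,\rho^2}{\epsilon^2}.
\end{align}
```

The right-hand side is small only when the purity $\mathrm{tr}\,\rho^2$ is much smaller than $\epsilon^2/K$, which is why low purity of the averaged state is essential to the statement. 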

The smallness of $\sigma_A^2$ (which is a must for typicality) follows from the conditions assumed in the above theorem: 

1. The expansion coefficients $c_n$ are statistically independent, and $c_n$ and $c_n e^{i\phi_n}$ are equally likely for arbitrary phase $\phi_n$. Thus $p(c) = \prod_n p_n(|c_n|)$. 

2. The mixed state $\rho = \overline{|\psi\rangle\langle\psi|} = \sum_n \overline{|c_n|^2}\, |n\rangle\langle n|$ (the overbar means average over $p(c)$) has low purity, $\mathrm{tr}\,\rho^2 \ll 1$. 

The author [55] justifies these assumptions qualitatively: they are valid "in practice", if not "in principle", for a system with a large number of degrees of freedom. A rigorous justification is seriously lacking. The role of quantum chaos in these theorems is not clear at present. KNT is also discussed in [41], where it is termed "thermodynamic normality". 
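
A small numerical experiment can make the two conditions concrete. The sketch below is illustrative only and is not taken from [35]: the Gaussian coefficient ensemble, the choice of an observable with eigenvalues +1 and -1, the function name `knt_variance`, and the value $q_n = 1$ used in the bound are all assumptions made for the example. The coefficients are independent and phase-symmetric, and the state is normalized only on average, so that $\rho = I/D$ and $\mathrm{tr}\,\rho^2 = 1/D$. 

```python
import random


def knt_variance(dim, samples=400, seed=0):
    """Estimate the spread of <psi|A|psi> over an ensemble of pure states.

    Assumptions (illustrative, not from the paper):
      * c_n are independent complex Gaussians with mean |c_n|^2 = 1/dim,
        so the averaged state is rho = I/dim and tr(rho^2) = 1/dim;
      * A is diagonal in the |n> basis with eigenvalues +1, -1, +1, ...;
      * the constants q_n in the bound are set to 1.
    Returns (sample variance, Delta_A^2 * tr(rho^2)).
    """
    rng = random.Random(seed)
    eigs = [1.0 if n % 2 == 0 else -1.0 for n in range(dim)]
    s = (1.0 / (2.0 * dim)) ** 0.5  # std of real and imaginary parts

    values = []
    for _ in range(samples):
        # <psi|A|psi> = sum_n a_n |c_n|^2 for a diagonal observable
        x = 0.0
        for a in eigs:
            re, im = rng.gauss(0.0, s), rng.gauss(0.0, s)
            x += a * (re * re + im * im)
        values.append(x)

    mean = sum(values) / samples
    var = sum((v - mean) ** 2 for v in values) / samples
    delta_a = max(eigs) - min(eigs)
    bound = delta_a ** 2 / dim
    return var, bound


if __name__ == "__main__":
    for d in (16, 256):
        var, bound = knt_variance(d)
        print(f"D = {d:4d}  variance = {var:.4f}  bound = {bound:.4f}")
```

In this toy ensemble the variance falls off roughly as $1/D$, so the expectation value in a single randomly drawn pure state becomes indistinguishable from the ensemble average as the dimension grows, without any reference to dynamics. 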

4. Dynamical Normal typicality (DNT) 

Von Neumann's quantum ergodic theorem is the first known theorem of dynamical normal typicality. This theorem is based upon the observer's inability to further resolve $\mathcal{H}_\nu$ by macroscopic means, and upon the bound on the fluctuations of the quantum expectation values of the operators about their microcanonical averages, which is due to the structure imposed on the Hilbert space by the macroscopic perspective (the large $D = \dim\mathcal{H}$ and large $n = \#\mathcal{D}$; see subsection B). 

The important point is that no chaos or disorder assumption was invoked by von Neumann. In his own words [25]: "...we emphasize that the true state (about which we do calculations) is a wavefunction, i.e., something microscopic - to introduce a macroscopic description of the state would mean to introduce disorder assumptions, which is what we definitely want to avoid". His work is a result of "scale separation" (i.e., the observer's inability to further resolve $\mathcal{H}_\nu$ by macroscopic means, the large $D = \dim\mathcal{H}$ and large $n = \#\mathcal{D}$, etc.). The spirit of his work resembles that of Khinchin and of more recent works [36, 62], in contrast to ETH, which involves quantum chaos. 

This state of affairs (whether or not disorder is necessary) makes the situation complicated, and no satisfactory resolution is available at present. We now consider the role of quantum chaos once more (the previous example was ETH). 

The important work in this direction (quantum chaos in the foundations of statistical mechanics) was done by Asher Peres in the 1980s [45]. He discusses DNT based upon his definition of quantum chaos: the observables are represented by pseudorandom matrices when the Hamiltonian is diagonal. Because of this, the expectation values of these observables tend to equilibrium values, and the fluctuations around these equilibrium values are, on average, very small. His argument for quantum ergodicity goes as follows. The quantum expectation value of an observable is $\langle A(t)\rangle = \sum_{E',E''} \rho_{E''E'} \langle E'|A|E''\rangle\, e^{i(E'-E'')t/\hbar}$; for a non-degenerate spectrum the oscillating terms average out, and the time average is $\overline{\langle A(t)\rangle} = \sum_E \rho_{EE} A_{EE}$ (here $\rho_{EE}$ is the density matrix in the energy representation). Here comes Peres' argument. Consider that the system has very many energy levels in the range $E$ to $E + \Delta E$. Then (1) the distribution of these energy levels is very different in the regular and chaotic cases [43]-[35]; (2) the matrix elements $A_{EE}$ too behave very differently in the regular and chaotic cases (regular systems have selection rules, so most $A_{EE}$ vanish and only a few are large, whereas in the chaotic case the $A_{EE}$ are pseudorandom (his definition of quantum chaos) and very numerous); and (3) for a chaotic system, the pseudorandom $A_{EE}$ are statistically independent of $\rho_{EE}$ in that energy range, and their average $A(E)$ does not appreciably depend on the energy interval $\Delta E$. As $\sum_E \rho_{EE} = 1$, one has $\overline{\langle A(t)\rangle} = A(E)$. This he calls quantum ergodicity. The fundamental assumption used is the statistical independence of $A_{EE}$ from $\rho_{EE}$. He then discusses a more relevant property, called mixing in the quantum case: the fluctuations of $\langle A(t)\rangle$ about its equilibrium value $A(E)$ are, on average, small, i.e., the root-mean-square deviation $\sqrt{\overline{\langle A(t)\rangle^2} - \overline{\langle A(t)\rangle}^{\,2}}$ is suppressed by a factor of order $1/\sqrt{N}$ ($N$ is the large number of energy levels in the interval $E$ to $E + \Delta E$). To prove this he again uses the assumption of statistical independence. Thus we see from Peres' program that DNT is a consequence of quantum chaos (observables as pseudorandom matrices) and the assumption of statistical independence. 
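
The scaling of the fluctuations with the number of levels can be illustrated with a toy model. The sketch below is not Peres' calculation: the random spectrum, the real symmetric Gaussian matrix standing in for a "pseudorandom" observable (entries of variance $1/N$, so that its eigenvalues stay of order one), the equal-weight initial state with random phases, the units $\hbar = 1$, and the function name `peres_fluctuation` are all assumptions chosen for illustration. 

```python
import cmath
import math
import random


def peres_fluctuation(n_levels, n_times=300, t_max=1.0e4, seed=1):
    """Toy model of <A(t)> for a pseudorandom observable (hbar = 1).

    Returns (time mean of <A(t)>, rms fluctuation in time,
             average of the diagonal elements A_EE).
    All modelling choices are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Non-degenerate toy spectrum with mean level spacing of order one.
    energies = [rng.uniform(0.0, n_levels) for _ in range(n_levels)]
    # Real symmetric matrix with independent Gaussian entries.
    s = 1.0 / math.sqrt(n_levels)
    a = [[0.0] * n_levels for _ in range(n_levels)]
    for m in range(n_levels):
        for n in range(m, n_levels):
            a[m][n] = a[n][m] = rng.gauss(0.0, s)
    # Pure initial state with rho_EE = 1/N and random phases.
    c0 = [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) / math.sqrt(n_levels)
          for _ in range(n_levels)]

    values = []
    for _ in range(n_times):
        t = rng.uniform(0.0, t_max)  # sample times over a long window
        c = [c0[k] * cmath.exp(-1j * energies[k] * t) for k in range(n_levels)]
        ac = [sum(a[m][n] * c[n] for n in range(n_levels))
              for m in range(n_levels)]
        values.append(sum((c[m].conjugate() * ac[m]).real
                          for m in range(n_levels)))

    mean = sum(values) / n_times
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / n_times)
    diagonal_average = sum(a[k][k] for k in range(n_levels)) / n_levels
    return mean, rms, diagonal_average


if __name__ == "__main__":
    for n in (16, 64):
        mean, rms, diag = peres_fluctuation(n)
        print(f"N = {n:3d}  mean = {mean:+.3f}  diag avg = {diag:+.3f}  "
              f"rms * sqrt(N) = {rms * math.sqrt(n):.2f}")
```

In this model the time mean tracks the average of the diagonal elements (the analogue of quantum ergodicity), while the product of the rms fluctuation and $\sqrt{N}$ stays of order one as $N$ grows (the analogue of mixing). 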

This is in sharp contrast with the approach of von Neumann, who did not use quantum chaos. We still do not have a coherent picture of when chaos is important (Peres) and when it is not (von Neumann). The various typicality approaches are summarized in figure (6). 



V. SUMMARY AND OPEN ISSUES 

1. Justification of Gibbs' ensembles is a complicated 
problem. 

2. The ergodic hypothesis is not necessary for the workings of statistical mechanics as far as one is concerned with sum-function observables and macroscopic systems. 

3. Statistical mechanics explains thermodynamic behaviour because of the property of statistical independence (or, more accurately, by the approach of Khinchin), which is the cause of typicality. The canonical formalism can be obtained from Landau's program without using any ad hoc hypothesis except the property of statistical independence. 

4. The issues of chaos and of integrability or non-integrability become important when one deals with small isolated systems, for example 100 or so bosonic or fermionic atoms in an optical lattice. They are irrelevant for a macroscopic system and sum-function observables. 

5. Clearly von Neumann's quantum ergodic theorem and other quantum-statistical typicality theorems are analogous to Khinchin's self-averaging theorem. They become more accurate for a system with a large number of degrees of freedom, and they say that for such systems the expectation values of the observables (sum-functions in Khinchin's case) stay close to the microcanonical/canonical predictions. In principle, Khinchin's approach should come out as a special case of von Neumann's theory. Clearly much work is needed in this direction. 

6. One can contemplate classical KCT (this is traditional statistical mechanics), classical DCT, and so on. 

7. The consolidation of the various quantum-statistical typicality theorems presented above singles out ETH and Peres' DNT. ETH and Peres' DNT involve quantum chaos, whereas others, like von Neumann's, do not. We still do not have a coherent picture of when chaos is important (Peres) and when it is not (von Neumann). 

I would like to end this manuscript with the following aphoristic words of Prof. Joel Lebowitz [51] about the ergodic hypothesis: 

"Now it is the real time to recognize that the Ergodic 
Hypothesis is not a necessary and is not a sufficient con- 
dition for the foundation of statistical mechanics" 

VI. APPENDIX 

A. Boltzmann's permutational argument and the origin of $S \propto \ln W$ 

Boltzmann introduced the notion of "Komplexions" of molecules as follows. Suppose the kinetic energies of the molecules take the discrete values $\epsilon, 2\epsilon, 3\epsilon, \ldots, n\epsilon$, and let $w_j$ be the number of molecules that possess kinetic energy $j\epsilon$, so that $\sum_{j=1}^{n} w_j = N$ and $\sum_{j=1}^{n} (j\epsilon)\, w_j = E$. The number of Komplexions for a given distribution is $P = N!/(w_1!\, w_2! \cdots w_n!)$. He noted that maximization of $P$ under the above constraints leads to the Maxwellian kinetic energy distribution. 
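
The maximization can be sketched in a few lines. The block below is the standard textbook route (Stirling's approximation plus Lagrange multipliers), not a transcription of Boltzmann's own computation; the multipliers $\alpha$ and $\beta$ are symbols introduced here for the two constraints. 

```latex
% Stirling: \ln w! \approx w \ln w - w for large w, so
\begin{align}
\ln P \;\approx\; N \ln N - \sum_{j=1}^{n} w_j \ln w_j .
\end{align}
% Maximize \ln P subject to \sum_j w_j = N and \sum_j (j\epsilon) w_j = E:
\begin{align}
\frac{\partial}{\partial w_j}
\Big[ \ln P - \alpha \sum_{k} w_k - \beta \sum_{k} (k\epsilon)\, w_k \Big]
 \;=\; -\ln w_j - 1 - \alpha - \beta\, j\epsilon \;=\; 0 ,
\end{align}
% which gives an exponential occupation of the energy levels:
\begin{align}
w_j \;=\; e^{-1-\alpha}\, e^{-\beta j \epsilon}
\quad\Longrightarrow\quad
\frac{w_j}{N} \;=\; \frac{e^{-\beta j\epsilon}}{\sum_{k=1}^{n} e^{-\beta k\epsilon}} .
\end{align}
```

The two multipliers are fixed by the constraints on $N$ and $E$; the exponential dependence on the energy $j\epsilon$ is the Boltzmann factor of the Maxwellian distribution. 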



Boltzmann then defines the "degree of permutability" $\Omega = \log P$, and finds (by direct computation) that $\Omega_{\max}$ is equal to the entropy of an ideal gas in a reversible process, up to an additive constant: 

$S = \int \frac{dQ}{T} = \Omega_{\max} = \max\{\log P\}.$ 

It was Max Planck who gave the general foundations to the fundamental principle $S = k_B \log W + c$ (using $W$ in place of $P$), introducing for the first time the constant $k_B$, called the Boltzmann constant. He proposed the following fundamental hypothesis, called Planck's thermodynamic probability hypothesis. 

"The entropy of physical system in a definite state de- 
pends solely on the thermodynamic probability W of this 
state", S = f(W). 

Consider two systems in equilibrium with each other. From the second law, one has additivity of entropy, $S = S_1 + S_2$, while the thermodynamic probabilities of the two systems multiply, $W = W_1 W_2$. Thus $f(W_1 W_2) = f(W) = f(W_1) + f(W_2)$, and by differentiating one obtains $f'(W) + W f''(W) = 0$. It gives 



$S = k_B \log W + c.$ 
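
The differentiation step can be written out explicitly. The block below is a routine filling-in of the argument; the integration constant $k$ is a symbol introduced here and is identified with $k_B$ at the end. 

```latex
% Start from f(W_1 W_2) = f(W_1) + f(W_2).
% Differentiate with respect to W_1:
\begin{align}
W_2\, f'(W_1 W_2) \;=\; f'(W_1).
\end{align}
% Differentiate the result with respect to W_2 (the right side does not depend on W_2):
\begin{align}
f'(W_1 W_2) + W_1 W_2\, f''(W_1 W_2) \;=\; 0
\quad\Longrightarrow\quad
f'(W) + W f''(W) \;=\; 0 .
\end{align}
% This is (W f'(W))' = 0, so W f'(W) = k, and integrating once more:
\begin{align}
f(W) \;=\; k \ln W + c , \qquad k \equiv k_B .
\end{align}
```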



See the very good little book by Enrico Fermi [46]. 

The important point here is that equilibrium is conceived as the most probable state rather than as the temporally constant state. But since, as we know, the fluctuations in observables are extremely small for macroscopic systems, the two definitions are approximately consistent with each other. 